34 research outputs found

    Competitive Parallel Disk Prefetching and Buffer Management

    We provide a competitive analysis framework for online prefetching and buffer management algorithms in parallel I/O systems, using a read-once model of block references. This has widespread applicability to key I/O-bound applications such as external merging and concurrent playback of multiple video streams. Two realistic lookahead models, global lookahead and local lookahead, are defined. Algorithms NOM and GREED, based on these two forms of lookahead, are analyzed for shared-buffer and distributed-buffer configurations, both of which occur frequently in existing systems. An important aspect of our work is that we show how to implement both models of lookahead in practice using the simple techniques of forecasting and flushing. Given a D-disk parallel I/O system and a globally shared I/O buffer that can hold up to M disk blocks, we derive a lower bound of Ω(√D) on the competitive ratio of any deterministic online prefetching algorithm with M-block lookahead. NOM is shown to match the lower bound using global M-block lookahead. In contrast, using only local lookahead results in an Ω(D) competitive ratio. When the buffer is distributed into D portions of M/D blocks each, the algorithm GREED based on local lookahead is shown to be optimal, and NOM is within a constant factor of optimal. Thus we provide a theoretical basis for the intuition that global lookahead is more valuable for prefetching in the case of a shared-buffer configuration, whereas local lookahead suffices in the distributed configuration. Finally, we analyze the performance of these algorithms for reference strings generated by a uniformly random stochastic process, and we show that they achieve the minimal expected number of I/Os. These results also give bounds on the worst-case expected performance of algorithms which employ randomization in the data layout.
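The read-once setting above can be made concrete with a small simulation: count how many parallel I/O steps a lookahead-driven prefetcher needs when each step fetches at most one block per disk into a shared buffer of M blocks. This is an illustrative sketch only, not the paper's NOM or GREED algorithms; the function name and the (disk, block) request encoding are invented for the example, and M ≥ 1 is assumed.

```python
from collections import deque

def greedy_prefetch_steps(refs, M, lookahead):
    """Parallel I/O steps to serve a read-once reference string.

    refs: sequence of (disk, block) pairs, each referenced exactly once.
    Each step fetches at most one block per disk into a shared buffer of
    capacity M; a reference is served (and its block evicted, since it
    is read once) as soon as the block is buffered. Assumes M >= 1.
    """
    pending = deque(refs)
    buffered = set()
    steps = 0
    while pending:
        # Serve every leading reference whose block is already buffered.
        while pending and pending[0] in buffered:
            buffered.discard(pending.popleft())
        if not pending:
            break
        # One parallel I/O: the earliest unbuffered reference per disk,
        # drawn from the lookahead window, up to buffer capacity.
        chosen = {}
        for ref in list(pending)[:lookahead]:
            if len(buffered) + len(chosen) >= M:
                break
            disk = ref[0]
            if ref not in buffered and disk not in chosen:
                chosen[disk] = ref
        buffered.update(chosen.values())
        steps += 1
    return steps
```

With refs = [(0, 'a'), (1, 'b'), (0, 'c'), (1, 'd')], M = 4, and lookahead 4, both disks stay busy and two steps suffice; with lookahead 1 the same string costs four steps, one per reference, which is the gap the lookahead models formalize.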

    ASP: Adaptive Online Parallel Disk Scheduling

    In this work we address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. We use the standard parallel disk model with D disks and a shared I/O buffer of size M. We design an on-line algorithm ASP (Adaptive Segmented Prefetching) with ML-block lookahead, L ≥ 1, and compare its performance to the best on-line algorithm with the same lookahead. We show that for any reference string the number of I/Os done by ASP is within a factor Θ(C) of the best possible, where C = min{…

    Pc-opt: Optimal offline prefetching and caching for parallel i/o systems

    We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm for parallel disk scheduling. Traditional buffer management algorithms that minimize the number of block misses are substantially suboptimal in a parallel I/O system where multiple I/Os can proceed simultaneously. We show that in the offline case, where a priori knowledge of all the requests is available, PC-OPT performs the minimum number of I/Os to service the given I/O requests. This is the first parallel I/O scheduling algorithm that is provably offline optimal in the parallel disk model. In the online case, we study the context of global L-block lookahead, which gives the buffer management algorithm a lookahead consisting of L distinct requests. We show that the competitive ratio of PC-OPT with global L-block lookahead is Θ(M − L + D) when L ≤ M, and Θ(MD/L) when L > M, where the number of disks is D and the buffer size is M.

    Optimal Read-Once Parallel Disk Scheduling

    An optimal prefetching and I/O scheduling algorithm L-OPT, for parallel I/O systems, using a read-once model of block references is presented. The algorithm uses knowledge of the next L references, L-block lookahead, to create a minimal-length I/O schedule. We show that the competitive ratio of L-OPT is Θ(√(MD/L)) for L ≤ M, which matches the lower bound of any prefetching algorithm with L-block lookahead. Tight bounds for the remaining ranges of lookahead are also presented. In addition, we show that L-OPT is the optimal offline algorithm: when the lookahead consists of the entire reference string, it performs the absolute minimum possible number of I/Os. Finally, we show that L-OPT is comparable to the best on-line algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is always within a constant factor of the best possible. Supported in part by the National Science Foundation under grant CCR-9704562.
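The Θ(√(MD/L)) tradeoff can be evaluated directly to see how extra lookahead buys a better schedule. A throwaway helper (not from the paper; constant factors omitted):

```python
import math

def lookahead_ratio_bound(M, D, L):
    """Evaluate the asymptotic competitive-ratio bound sqrt(M*D/L)
    for L-block lookahead, valid in the range 1 <= L <= M."""
    assert 1 <= L <= M, "bound applies only for 1 <= L <= M"
    return math.sqrt(M * D / L)
```

For example, with M = 64 and D = 16 the bound falls from 32 at L = 1 to 4 at L = M, quantifying the value of lookahead in this model.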

    Analysis of simple randomized buffer management for parallel I/O

    Buffer management for a D-disk parallel I/O system is considered in the context of randomized placement of data on the disks. A simple prefetching and caching algorithm PHASE-LRU using bounded lookahead is described and analyzed. It is shown that PHASE-LRU performs an expected number of I/Os that is within a factor Θ(log D / log log D) of the number performed by an optimal off-line algorithm. In contrast, any deterministic buffer…
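The log D / log log D factor reflects a balls-into-bins effect: under random placement, the busiest disk bounds how many parallel I/Os a batch of fetches needs. A quick simulation of that bottleneck (a hypothetical helper for illustration, not PHASE-LRU itself):

```python
import random
from collections import Counter

def phase_parallel_ios(num_blocks, num_disks, rng):
    """Parallel I/Os to fetch num_blocks randomly placed blocks:
    one block per disk per I/O, so the most loaded disk is the
    bottleneck (the classic balls-into-bins maximum load)."""
    load = Counter(rng.randrange(num_disks) for _ in range(num_blocks))
    return max(load.values()) if load else 0
```

Reading D blocks spread over D disks typically costs on the order of log D / log log D parallel I/Os rather than 1, which is exactly the gap the expected-case analysis captures.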

    Parallel-Disk Buffer Management using Randomization

    We present an on-line algorithm for prefetching and caching for a multiple-disk parallel I/O system. We introduce the notion of write-back, whereby blocks are relocated between disks during the course of the computation. Write-back allows the layout to be altered dynamically to suit access patterns in different parts of the reference string. The algorithm is analyzed using the standard parallel disk model with D parallel disks and a shared I/O buffer of size M blocks. We show that any on-line algorithm with M-block lookahead using deterministic write-back and buffer management policies must have a competitive ratio of Ω(D). We therefore present a randomized algorithm, RAND-WB, that uses randomized write-back and attains a competitive ratio of Θ(√D), the best achievable by any on-line algorithm with M-block lookahead. Simulations show that RAND-WB obtains considerable performance benefit even when the data has just a small amount of re-use.

    An Improved Parallel Disk Scheduling Algorithm

    We address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. Read-once reference strings, in which each block is accessed exactly once, arise naturally in applications like databases and video retrieval. Using the standard parallel disk model with D disks and a shared I/O buffer of size M, we present a novel algorithm, Red-Black Prefetching (RBP), for parallel I/O scheduling. The number of parallel I/Os performed by RBP is within O(D^(1/3)) of the minimum possible. Algorithm RBP is easy to implement and requires computation time linear in the length of the reference string. Through simulation experiments we validated the benefits of RBP over simple greedy prefetching. Modern applications like multimedia servers, seismic databases, and visualization and graphics need access to large data sets that reside on external storage. The high data access rates demanded by such applications have resulted in the I/O subsystem…